Margin-based Feature Selection Techniques for Support Vector Machine Classification

Authors

Yaman Aksu

David J. Miller

George Kesidis

Abstract

Feature selection for classification in high-dimensional feature spaces can improve generalization accuracy and reduce classifier complexity, and it is also useful for identifying important feature “markers”, e.g., biomarkers in a bioinformatics or biomedical context. For support vector machine (SVM) classification, a widely used feature selection technique is recursive feature elimination (RFE). In recent work, we demonstrated that the RFE objective is not generally consistent with the margin maximization objective that is central to the SVM learning approach. We thus proposed explicit margin-based feature elimination (MFE) for SVMs and demonstrated both improved margin and improved generalization accuracy compared with RFE for the case of linear SVMs. In this paper, after reviewing MFE, we first introduce an extension that achieves further gains in margin at small computational cost. This extension solves the SVM optimization problem to maximize the classifier’s margin at each feature elimination step, albeit in a lightweight fashion, by optimizing only two degrees of freedom: the weight vector’s slope and intercept. We next consider the case of a nonlinear kernel. We show that RFE, as defined for the nonlinear kernel case, assumes that the weight vector length is strictly decreasing as features are eliminated. We demonstrate experimentally that this assumption is not in general valid for the Gaussian kernel and that, consequently, RFE may give poor results in this case. An extension of MFE to the nonlinear kernel case gives both better margin and better generalization accuracy. This approach may help nonlinear kernel SVMs avoid overfitting and thus achieve better results than linear SVMs in some high-dimensional domains where nonlinear kernels have not so far proved advantageous.
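The contrast the abstract draws between the two elimination criteria for a linear SVM can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`margin`, `rfe_choice`, `mfe_choice`) are hypothetical, the RFE rule is taken to be the usual "drop the feature with the smallest squared weight", the margin-based rule is read as "drop the feature whose removal leaves the largest margin without retraining", and the weights and data are hand-picked toy values rather than the output of a trained SVM.

```python
import math


def margin(w, b, X, y, keep):
    """Smallest signed distance of any sample to the hyperplane,
    using only the feature indices listed in `keep`."""
    norm = math.sqrt(sum(w[j] ** 2 for j in keep))
    if norm == 0.0:
        return float("-inf")  # degenerate case: no usable weight left
    return min(
        yi * (sum(w[j] * xi[j] for j in keep) + b) / norm
        for xi, yi in zip(X, y)
    )


def rfe_choice(w, keep):
    """RFE-style rule (assumed): drop the feature with the smallest squared weight."""
    return min(keep, key=lambda j: w[j] ** 2)


def mfe_choice(w, b, X, y, keep):
    """Margin-based rule (assumed reading): drop the feature whose removal
    leaves the largest margin, keeping the remaining weights fixed."""
    return max(
        keep,
        key=lambda j: margin(w, b, X, y, [k for k in keep if k != j]),
    )


# Hand-picked toy values (assumption: NOT the result of SVM training).
w = [1.0, 0.8]
b = 0.0
X = [[0.1, 2.0], [-0.1, -2.0]]
y = [1, -1]
keep = [0, 1]

print("RFE would drop feature", rfe_choice(w, keep))
print("Margin-based rule would drop feature", mfe_choice(w, b, X, y, keep))
```

With these toy values the two rules disagree: the weight-based rule drops feature 1 because its weight is smaller, while the margin-based rule drops feature 0 because the samples are far better separated along feature 1. The example only shows that the criteria can differ; it says nothing about how often they do on real data.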

Similar resources

Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets

Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...

Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran

Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression data give access to a wealth of information spanning thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce high-dimensional data by a statistical method to select valuable, high-impact genes as biomarkers and then classify ovarian tumors based on gene expression data of...

Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine

Different approaches have been proposed for feature selection to obtain a suitable feature subset from among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected according to some measu...

Modeling and design of a diagnostic and screening algorithm based on hybrid feature selection-enabled linear support vector machine classification

Background: In the current study, a hybrid feature selection approach involving filter and wrapper methods is applied to some bioscience databases with various records, attributes and classes; this strategy thus combines the advantages of both methods, such as fast execution, generality, and accuracy. The purpose is to diagnose disease status and to estimate patient survival. Method...

Publication date: 2008
